Abstract

We review existing methods for computing first order moments on junction trees using the
Shafer-Shenoy algorithm. First, we treat the computation of first order moments as an all
vertices problem on a junction tree. Solved this way, the problem requires memory of the
order of the cardinality of the junction tree edge set. We then consider two algorithms, the
Lauritzen-Nilsson algorithm and the Maua et al. algorithm, which compute the first order
moments as a normalization problem on the junction tree, using memory of the order of the
cardinality of the junction tree leaf set.



1 Shafer-Shenoy algorithm 






1.1 Potentials and operations 



Let $X = (X_u : u \in U)$ be a finite collection of discrete random variables. Let $\Omega_u$
denote the set of possible values that $X_u$ can take. For $A \subseteq U$ we write $\Omega_A$
for the Cartesian product $\times_{u \in A} \Omega_u$ and write $X_A$ for $\{X_u : u \in A\}$ ¹.
The elements of $\Omega_A$, $A \subseteq U$, are denoted $x_A$ and called configurations. We adopt
the convention that $\Omega_\emptyset$ consists of a single element $x_\emptyset = \diamond$,
i.e. $\Omega_\emptyset = \{\diamond\}$.

Let $\Pi_U$ be a set of functions $\pi_A : \Omega_A \to \mathcal{K}$, where $A \subseteq U$, i.e.
$\Pi_U = \{\pi_A : \Omega_A \to \mathcal{K} \mid A \subseteq U\}$. In the following text, the
functions from $\Pi_U$ are called potentials. Let $\otimes$ be a binary operation on potentials,
called combination, and let $\downarrow$ denote the external operation called marginalization,
which to every $\pi_A : \Omega_A \to \mathcal{K}$ associates
$\pi_A^{\downarrow B} : \Omega_{A \cap B} \to \mathcal{K}$, where $A, B \subseteq U$.
We assume that the following Shafer-Shenoy axioms hold for combination and marginalization.



1.2 Shafer-Shenoy axioms 

Axiom 1 (Commutativity and Associativity) Let $\pi_A$, $\pi_B$ and $\pi_C$ be potentials. Then

$$\pi_A \otimes \pi_B = \pi_B \otimes \pi_A \quad \text{and} \quad \pi_A \otimes (\pi_B \otimes \pi_C) = (\pi_A \otimes \pi_B) \otimes \pi_C. \tag{1}$$

Axiom 1 allows us to use the notation $\pi_A \otimes \pi_B \otimes \pi_C$.

Axiom 2 (Consonance) Let $\pi_A$ be a potential on $A$, and let $A \supseteq B \supseteq C$. Then

$$\left(\pi_A^{\downarrow B}\right)^{\downarrow C} = \pi_A^{\downarrow C}. \tag{2}$$

Axiom 3 (Distributivity) Let $\pi_A$ and $\pi_B$ be potentials on $A$ and $B$, respectively. Then

$$(\pi_A \otimes \pi_B)^{\downarrow A} = \pi_A \otimes \pi_B^{\downarrow A}. \tag{3}$$

¹ We implicitly assume some natural ordering in sets.



1.3 Junction Tree 

The joint potential $\pi_U : \Omega_U \to \mathcal{K}$ is said to factorize on $T$ with respect
to $\otimes$ if there exist potentials $\pi_V : \Omega_V \to \mathcal{K}$ for $V \in \mathcal{V}$,
so that we can write $\pi_U$ as

$$\pi_U = \bigotimes_{V \in \mathcal{V}} \pi_V. \tag{4}$$

In this paper we consider joint potentials that can be represented by a junction tree, which
is defined as follows.

Definition 1 Let $\mathcal{V} = \{V_1, V_2, \ldots, V_n\}$ be a collection of subsets of $U$ (the
set of variable indices) and $T$ a tree with $\mathcal{V}$ as its node set (corresponding to a set
of local domains). Then $T$ is said to be a junction tree if any intersection $V_i \cap V_j$ of a
pair $V_i, V_j$ of sets in $\mathcal{V}$ is contained in every node on the unique path in $T$
between $V_i$ and $V_j$. Equivalently, for any $u \in U$, the set of subsets in $\mathcal{V}$
containing $u$ induces a connected subtree of $T$.
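The path condition of Definition 1 can be checked mechanically. The following is a minimal sketch, not part of the reviewed algorithms: the function names `tree_path` and `is_junction_tree`, the node labels, and the example sets are all hypothetical, and the input graph is assumed to be a tree.

```python
from itertools import combinations

def tree_path(adj, start, goal):
    """Return the unique path between two nodes of a tree, or None."""
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == goal:
            return path
        for nxt in adj[node]:
            if nxt not in path:
                stack.append((nxt, path + [nxt]))
    return None

def is_junction_tree(nodes, edges):
    """nodes: label -> frozenset of variable indices; edges: pairs of labels."""
    adj = {v: set() for v in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    for a, b in combinations(nodes, 2):
        common = nodes[a] & nodes[b]
        path = tree_path(adj, a, b)
        # every node on the path must contain the intersection of its endpoints
        if path is None or any(not common <= nodes[c] for c in path):
            return False
    return True

# Hypothetical examples: a valid chain and a tree that violates the condition.
chain = {"V1": frozenset({1, 2}), "V2": frozenset({2, 3}), "V3": frozenset({3, 4})}
broken = {"V1": frozenset({1, 2}), "V2": frozenset({3, 4}), "V3": frozenset({2, 3})}
links = [("V1", "V2"), ("V2", "V3")]
print(is_junction_tree(chain, links))   # True
print(is_junction_tree(broken, links))  # False
```

In the second example the variable 2 appears in the two end nodes but not in the middle one, so the nodes containing it do not form a connected subtree.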

The set of all neighbors of $A$ is denoted $ne(A)$. We omit the set braces from the notation
when this cannot cause misunderstanding. Hence, $ne(A) \setminus B$ stands for the set of all
neighbors of $A$ without $B$, $ne(A) \setminus B, C$ for the set of all neighbors of $A$ without
$B$ and $C$, and so on. $V_i \sim V_{i+1}$ denotes that $V_i$ and $V_{i+1}$ are neighbors.

A junction tree is usually drawn with the sets $V_i$ as node labels. In the following text a
node is identified with its label. The general procedure for building a junction tree can be
found in [1] and [2]. An example of a junction tree, which corresponds to the chain factorization

$$\pi_U = \bigotimes_{i=1}^{n} \pi_{V_i}, \qquad V_i \sim V_{i+1} \ \text{for} \ i = 1, \ldots, n-1, \tag{5}$$

is given in Fig. 1.

Fig. 1. The junction tree for the chain factorization $\pi_U = \bigotimes_{i=1}^{n} \pi_{V_i}$.



1.4 Problems 

The junction tree enables the solution of three important problems: 

1. The single vertex problem at node $A$ is defined as the computation of the potential
$\psi_A : \Omega_A \to \mathcal{K}$, defined by

$$\psi_A = \pi_U^{\downarrow A} = \Big( \bigotimes_{V \in \mathcal{V}} \pi_V \Big)^{\downarrow A}. \tag{6}$$

2. The all vertices problem is defined as the computation of the functions $\psi_A$ for all
$A \in \mathcal{V}$.

3. The normalization problem is the marginalization of the joint potential (4) to the empty
set $\emptyset$. Using the consonance of marginalization (2), it follows directly from the
solution of the single vertex problem at an arbitrary node $A$:

$$\pi_U^{\downarrow \emptyset} = \left(\pi_U^{\downarrow A}\right)^{\downarrow \emptyset} = \psi_A^{\downarrow \emptyset}. \tag{7}$$



1.5 Local computation algorithm 

These problems can efficiently be solved with the Shafer-Shenoy local computation algorithm 
(LCA). The algorithm can be described as passing the messages over the edges and processing 
the messages in the nodes of the junction tree. 

Messages are passed between the vertices from $\mathcal{V}$ via mailboxes. All mailboxes are
initialized as empty. When a message has been placed in a mailbox, the box is full. A node $A$
in the junction tree is allowed to send a message to its neighbor $B$ if it has not done so
before and if all $A$-incoming mailboxes are full, except possibly the one reserved for the
messages coming from $B$. So, initially only the leaves (nodes which have only one neighbor) of
the junction tree are allowed to send messages. As the message passing proceeds, other nodes
get their turn, and eventually all mailboxes are full, i.e., exactly two messages have been
passed along each branch of the junction tree.
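The mailbox rule above can be illustrated with a short scheduling sketch. It only computes an order in which messages may be sent, not the messages themselves. The function name `message_schedule` and the four-node example tree are hypothetical.

```python
def message_schedule(edges):
    """Return an order in which the directed messages (A, B) can be sent."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    sent = []      # messages in the order they were placed in mailboxes
    full = set()   # full mailboxes, stored as (sender, receiver)
    progress = True
    while progress:
        progress = False
        for a in sorted(adj):
            for b in sorted(adj[a]):
                if (a, b) in full:
                    continue  # A has already sent to B
                # A may send to B once every other neighbor C has sent to A
                if all((c, a) in full for c in adj[a] - {b}):
                    full.add((a, b))
                    sent.append((a, b))
                    progress = True
    return sent

# Hypothetical star-shaped tree with center B and leaves A, C, D.
order = message_schedule([("A", "B"), ("B", "C"), ("B", "D")])
print(order[:3])   # the three leaves send first
print(len(order))  # 6, i.e. two messages per edge
```

For a tree the loop stops exactly when every mailbox is full, so the number of scheduled messages is twice the number of edges.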

The message from $A$ to $B$ is a function $\pi_{A \to B} : \Omega_{A \cap B} \to \mathcal{K}$. The
passage of a message $\pi_{A \to B}$ from node $A$ to node $B$ is performed by absorption.
Absorption from clique $A$ to clique $B$ involves eliminating the variables $A \setminus B$ from
the potentials associated with $A$ and its neighbors except $B$. The structure of the message
$\pi_{A \to B}$ is given by

$$\pi_{A \to B} = \Big( \pi_A \otimes \Big( \bigotimes_{C \in ne(A) \setminus B} \pi_{C \to A} \Big) \Big)^{\downarrow B}, \tag{8}$$

where $\pi_{C \to A}$ is the message passed from $C$ to $A$. Since a leaf has only one neighbor,
the product on the right-hand side is empty and the message can initially be computed as

$$\pi_{A \to B} = \pi_A^{\downarrow B}. \tag{9}$$

Suppose we start with a joint potential $\pi_U$ on a junction tree $T$, and pass messages towards
a root clique $R$ as described above. When $R$ has received a message from each of its neighbors,
the combination of all messages with its own potential is equal to a decomposition of the
$R$-marginal of $\pi_U$:

$$\pi_U^{\downarrow R} = \Big( \bigotimes_{V \in \mathcal{V}} \pi_V \Big)^{\downarrow R} = \pi_R \otimes \Big( \bigotimes_{V \in ne(R)} \pi_{V \to R} \Big), \tag{10}$$

where $\mathcal{V}$ is the vertex set of $T$.

Hence, if we want to solve the single vertex problem at node $A$, we need to compute all
messages directed towards $A$, while for the all vertices problem we need the messages in both
directions between all pairs of neighboring nodes in the tree.
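As an illustration of (8) and (10), the sketch below solves a single vertex problem for the familiar sum-product instance, where combination is the pointwise product and marginalization is summation. This particular choice of operations, the binary domain, the helper names and the example tree are assumptions made only for this sketch.

```python
from itertools import product

DOM = (0, 1)  # assumption: every variable is binary

def proj(x, vars_, sub):
    """Restrict configuration x on vars_ to the variables listed in sub."""
    return tuple(x[vars_.index(v)] for v in sub)

def tab(vars_, fn):
    """Tabulate fn over all configurations of the ordered tuple vars_."""
    return vars_, {x: fn(*x) for x in product(DOM, repeat=len(vars_))}

def combine(f, g):
    """Pointwise product on the union of the two domains."""
    (vf, tf), (vg, tg) = f, g
    vu = tuple(sorted(set(vf) | set(vg)))
    return vu, {x: tf[proj(x, vu, vf)] * tg[proj(x, vu, vg)]
                for x in product(DOM, repeat=len(vu))}

def marginalize(f, keep):
    """Sum out the variables of f that are not in keep."""
    vf, tf = f
    vk = tuple(v for v in vf if v in keep)
    out = {}
    for x, val in tf.items():
        k = proj(x, vf, vk)
        out[k] = out.get(k, 0) + val
    return vk, out

def message(a, b, pots, adj):
    """Message from node a to its neighbor b, following (8)."""
    acc = pots[a]
    for c in adj[a]:
        if c != b:
            acc = combine(acc, message(c, a, pots, adj))
    return marginalize(acc, set(pots[b][0]))

def single_vertex(root, pots, adj):
    """Marginal of the joint potential at root, following (10)."""
    acc = pots[root]
    for c in adj[root]:
        acc = combine(acc, message(c, root, pots, adj))
    return acc

# Hypothetical junction tree: center B = {2,3}, leaves A = {1,2}, C = {3,4}, D = {3,5}.
pots = {
    "A": tab((1, 2), lambda a, b: 1 + a + 2 * b),
    "B": tab((2, 3), lambda b, c: 2 + b * c),
    "C": tab((3, 4), lambda c, d: 1 + 3 * c + d),
    "D": tab((3, 5), lambda c, e: 2 + c + e),
}
adj = {"A": ["B"], "B": ["A", "C", "D"], "C": ["B"], "D": ["B"]}

via = single_vertex("B", pots, adj)
joint = combine(combine(combine(pots["A"], pots["B"]), pots["C"]), pots["D"])
brute = marginalize(joint, {2, 3})
print(via == brute)  # True
```

The recursion computes only the three messages directed towards the root, which matches the number of edges in the example tree.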

For the single vertex problem, the algorithm starts at the leaves, which send messages to
their neighbors. A node sends a message to a neighbor once it has received messages from each
of its other neighbors. The node $A$ never sends a message. Thus, each message is sent only once
until $A$ has received the messages from all its neighbors, at which point the required marginal
is computed and the algorithm terminates with the total number of computed messages equal
to the number of edges of the tree. Once we have solved the single vertex problem at the node
$A$, the normalization problem can be solved with (7).

The first part of the algorithm for the all vertices problem is similar to the single vertex
case. The messages are sent from the leaves toward the interior of the tree until a node $C$ has
received the messages from all its neighbors. After that the messages are sent from $C$ back to
the leaves. The algorithm stops when all leaves have received messages. The total number of
computed messages is equal to twice the number of edges in the tree (for any two neighboring
nodes $A$ and $B$ we send the messages $\pi_{A \to B}$ and $\pi_{B \to A}$).

2 First order moments 

2.1 Operations on the set of functions 

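For real-valued functions $\phi_A : \Omega_A \to \mathbb{R}$ and $\phi_B : \Omega_B \to \mathbb{R}$,
the sum $\phi_A + \phi_B : \Omega_{A \cup B} \to \mathbb{R}$ and the product
$\phi_A \cdot \phi_B : \Omega_{A \cup B} \to \mathbb{R}$ are respectively defined by

$$(\phi_A + \phi_B)(x_{A \cup B}) = \phi_A(x_A) + \phi_B(x_B), \tag{11}$$
$$(\phi_A \cdot \phi_B)(x_{A \cup B}) = \phi_A(x_A) \cdot \phi_B(x_B), \tag{12}$$

for all $x_A \in \Omega_A$ and $x_B \in \Omega_B$. The product $\phi_A \cdot \phi_B$ is usually
abbreviated as $\phi_A \phi_B$.

We define the sum-marginal operator $\sum_{x_{C \setminus A}}$ for $A \subseteq C$, which to every
real-valued function $\phi_C : \Omega_C \to \mathbb{R}$ associates the function
$\sum_{x_{C \setminus A}} \phi_C : \Omega_A \to \mathbb{R}$, defined by

$$\Big( \sum_{x_{C \setminus A}} \phi_C \Big)(x_A) = \sum_{x_{C \setminus A} \in \Omega_{C \setminus A}} \phi_C(x_C), \tag{13}$$

and the marginalization is defined by

$$\phi_C^{\downarrow A} = \sum_{x_{C \setminus A}} \phi_C. \tag{14}$$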
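The operations (11)-(13) can be sketched on tabulated functions as follows. The representation (a tuple of variable indices plus a dictionary from configurations to values), the binary domain, and the names `make`, `pointwise` and `sum_marginal` are assumptions of this sketch, not notation from the reviewed papers.

```python
import operator
from itertools import product

DOM = (0, 1)  # assumption: every variable is binary

def make(vars_, fn):
    """Tabulate fn over all configurations of the ordered tuple vars_."""
    return vars_, {x: fn(*x) for x in product(DOM, repeat=len(vars_))}

def project(x, vars_, sub):
    """Restrict configuration x on vars_ to the variables listed in sub."""
    return tuple(x[vars_.index(v)] for v in sub)

def pointwise(f, g, op):
    """Combine f on A and g on B into a function on A ∪ B, as in (11)-(12)."""
    (fa, ft), (ga, gt) = f, g
    union = tuple(sorted(set(fa) | set(ga)))
    table = {x: op(ft[project(x, union, fa)], gt[project(x, union, ga)])
             for x in product(DOM, repeat=len(union))}
    return union, table

def sum_marginal(f, keep):
    """Sum out every variable of f that is not in keep, as in (13)."""
    fa, ft = f
    kept = tuple(v for v in fa if v in keep)
    table = {}
    for x, val in ft.items():
        key = project(x, fa, kept)
        table[key] = table.get(key, 0) + val
    return kept, table

f = make((1, 2), lambda a, b: a + 2 * b)   # hypothetical function on {1, 2}
g = make((2, 3), lambda b, c: b * c + 1)   # hypothetical function on {2, 3}
s = pointwise(f, g, operator.add)
p = pointwise(f, g, operator.mul)
print(s[1][(1, 0, 1)])            # f(1,0) + g(0,1) = 2
print(sum_marginal(p, {2})[1])    # {(0,): 2, (1,): 15}
```

Marginalizing onto the empty set returns a table with the single key `()`, which mirrors the convention that $\Omega_\emptyset$ has exactly one element.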

2.2 Definition of first order moments 

The non-negative joint probability function $p_U : \Omega_U \to \mathbb{R}$ of the random variable
$X_U$ is said to factorize multiplicatively on $T$ if there exist non-negative real functions
$p_C : \Omega_C \to \mathbb{R}$ for $C \in \mathcal{V}$, so that we can write $p_U$ as

$$p_U = \prod_{C \in \mathcal{V}} p_C. \tag{15}$$

Similarly, the function $h_U : \Omega_U \to \mathbb{R}$ is said to factorize additively on $T$ if
there exist real functions $h_C : \Omega_C \to \mathbb{R}$ for $C \in \mathcal{V}$, so that we can
write $h_U$ as

$$h_U = \sum_{C \in \mathcal{V}} h_C. \tag{16}$$

The first order moment potential $m_C : \Omega_C \to \mathbb{R}$ is defined by

$$m_C = \sum_{x_{U \setminus C}} p_U \cdot h_U. \tag{17}$$

In the case $C = \emptyset$, the first order moment potential is simply denoted $m$,

$$m = \sum_{x_U} p_U \cdot h_U, \tag{18}$$

and called the first order moment.

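Example 1 The first order moment potential may be useful for expressing the conditional
expectation

$$E[h(X_U) \mid X_C = x_C] = \sum_{x_{U \setminus C}} p(x_{U \setminus C} \mid x_C) \, h(x_{U \setminus C}, x_C) \tag{19}$$

for $C \in \mathcal{V}$. After usage of

$$p(x_{U \setminus C} \mid x_C) = \frac{p(x_{U \setminus C}, x_C)}{p(x_C)} = \frac{p(x_{U \setminus C}, x_C)}{\sum_{x_{U \setminus C}} p(x_{U \setminus C}, x_C)}, \tag{20}$$

we have

$$E[h(X_U) \mid X_C = x_C] = \frac{\sum_{x_{U \setminus C}} p(x_{U \setminus C}, x_C) \, h(x_{U \setminus C}, x_C)}{\sum_{x_{U \setminus C}} p(x_{U \setminus C}, x_C)} = \frac{m_C(x_C)}{p_U^{\downarrow C}(x_C)}. \tag{21}$$

Consequently, the unconditional expectation equals the first order moment

$$E[h(X_U)] = \sum_{x_U} p(x_U) \, h(x_U) = m. \tag{22}$$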

2.3 The problem of first order moments computation as all vertices 
problem 

The computation of (18) by enumerating all configurations would require a number of operations
that grows exponentially with the number of variables $|U|$. However, the computational
complexity can be reduced using the local computation algorithm, which exploits the structure
of the functions given by the factorizations (15) and (16). In this case, the marginal values
$p_U^{\downarrow C}$ are computed for all $C \in \mathcal{V}$ using local computation over the set
of real-valued functions. After that the moment is computed according to the equality

$$m = \sum_{C \in \mathcal{V}} \sum_{x_C} h_C \, p_U^{\downarrow C}, \tag{23}$$

which follows from

$$m = \sum_{x_U} p_U h_U = \sum_{x_U} p_U \sum_{C \in \mathcal{V}} h_C = \sum_{C \in \mathcal{V}} \sum_{x_U} p_U h_C = \sum_{C \in \mathcal{V}} \sum_{x_C} h_C \sum_{x_{U \setminus C}} p_U = \sum_{C \in \mathcal{V}} \sum_{x_C} h_C \, p_U^{\downarrow C}. \tag{24}$$

This method requires storing the marginal values $p_U^{\downarrow C}$ for all $C \in \mathcal{V}$,
which unnecessarily increases the memory complexity. Instead, we can use the local computation
algorithms by Lauritzen and Nilsson [3] and Maua et al. [4], which find the moment as the
solution of the normalization problem. In the following section, we consider these two algorithms.
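Equality (23) can be checked numerically on a small instance. In the sketch below the two-node chain $V_1 = \{1, 2\}$, $V_2 = \{2, 3\}$ and all table values are hypothetical, and the marginals are obtained by brute force for brevity, whereas the method described above would obtain them by local computation.

```python
from itertools import product

DOM = (0, 1)  # assumption: binary variables X1, X2, X3

# Hypothetical factors on V1 = {1, 2} and V2 = {2, 3}; their product sums to one.
p1 = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}     # p_{V1}(x1, x2)
p2 = {(0, 0): 0.5, (0, 1): 0.5, (1, 0): 0.25, (1, 1): 0.75}   # p_{V2}(x2, x3)
h1 = {(0, 0): 1.0, (0, 1): -2.0, (1, 0): 0.5, (1, 1): 3.0}    # h_{V1}(x1, x2)
h2 = {(0, 0): 0.0, (0, 1): 4.0, (1, 0): -1.0, (1, 1): 2.0}    # h_{V2}(x2, x3)

def p_joint(x1, x2, x3):
    """Multiplicative factorization (15)."""
    return p1[x1, x2] * p2[x2, x3]

def h_joint(x1, x2, x3):
    """Additive factorization (16)."""
    return h1[x1, x2] + h2[x2, x3]

# Direct enumeration of (18) over all configurations of X_U.
m_direct = sum(p_joint(*x) * h_joint(*x) for x in product(DOM, repeat=3))

# Marginals of p_U on V1 and V2 (brute force here, for brevity only).
marg1 = {(x1, x2): sum(p_joint(x1, x2, x3) for x3 in DOM)
         for x1, x2 in product(DOM, repeat=2)}
marg2 = {(x2, x3): sum(p_joint(x1, x2, x3) for x1 in DOM)
         for x2, x3 in product(DOM, repeat=2)}

# Right-hand side of (23): one sum per node, weighted by the local h_C.
m_marginals = (sum(h1[x] * marg1[x] for x in marg1)
               + sum(h2[x] * marg2[x] for x in marg2))

print(abs(m_direct - m_marginals) < 1e-12)  # True
```

Both marginal tables have to be kept until the final sums are formed, which is the memory cost that the two algorithms of the next section avoid.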
3 First order moments computation using order pair potential algorithms

3.1 Order pair potentials 

In our local computation algorithms we represent the quantitative elements through entities 
called potentials. Each such potential has two parts, as detailed below. 

Definition 2 (Potential) A potential on $C \subseteq U$ is a pair $\pi_C = (p_C, h_C)$ of
real-valued functions on $\Omega_C$, where $p_C$ is nonnegative.

Thus, a potential consists of two parts, the $p$-part and the $h$-part. We call a potential
$\pi_C$ vacuous if $\pi_C = (1, 0)$. We identify two potentials
$\pi_C^{(1)} = (p_C^{(1)}, h_C^{(1)})$ and $\pi_C^{(2)} = (p_C^{(2)}, h_C^{(2)})$ on $C$ and write
$\pi_C^{(1)} = \pi_C^{(2)}$ if $p_C^{(1)} = p_C^{(2)}$ and $h_C^{(1)}(x_C) = h_C^{(2)}(x_C)$ whenever

$$p_C^{(1)}(x_C) = p_C^{(2)}(x_C) > 0, \tag{25}$$

i.e., two potentials are considered equal if they have identical probability parts and their
utility parts agree almost surely with respect to the probability parts.

To represent and evaluate the decision problem in terms of potentials, we define the basic
operations of combination and marginalization. There are two possible ways to define the
operations:

1. Lauritzen-Nilsson algorithm [3]

2. Maua et al. algorithm [4]

3.2 Lauritzen-Nilsson algorithm 

Definition 3 (Combination) The combination of two potentials $\pi_A = (p_A, h_A)$ and
$\pi_B = (p_B, h_B)$ denotes the potential on $A \cup B$ given by

$$\pi_A \otimes \pi_B = (p_A \cdot p_B, \ h_A + h_B). \tag{26}$$

Definition 4 (Marginalization) The marginalization of $\pi_C = (p_C, h_C)$ onto
$A \subseteq C \in \mathcal{V}$ is defined by

$$\pi_C^{\downarrow A} = \Big( \sum_{x_{C \setminus A}} p_C, \ \frac{\sum_{x_{C \setminus A}} p_C \cdot h_C}{\sum_{x_{C \setminus A}} p_C} \Big). \tag{27}$$

Here we have used the convention that $0/0 = 0$, which will be used throughout.

As shown in Lauritzen and Nilsson [3], these operations of combination and marginalization
satisfy the Shafer-Shenoy axioms [5], and tree structured factorizations can
be marginalized using the Shafer-Shenoy algorithm.

If the operations are defined in this way and the potentials are set to

$$\pi_C = (p_C, h_C) \tag{28}$$

and the factorizations (15) and (16) hold, then

$$\pi_U = \bigotimes_{C \in \mathcal{V}} \pi_C = \bigotimes_{C \in \mathcal{V}} (p_C, h_C) = \Big( \prod_{C \in \mathcal{V}} p_C, \ \sum_{C \in \mathcal{V}} h_C \Big) = (p_U, h_U). \tag{29}$$

Accordingly, we have

$$\pi_U^{\downarrow \emptyset} = \Big( \sum_{x_U} p_U, \ \frac{\sum_{x_U} p_U \cdot h_U}{\sum_{x_U} p_U} \Big) = (1, m), \tag{30}$$

where we have used the probability condition $\sum_{x_U} p_U = 1$. Hence, the first order moment
potential can be computed using the Shafer-Shenoy local computation algorithm, where the
combination and marginalization are defined by (26) and (27). The messages have the form

$$\pi_{A \to B} = \big( \pi_{A \to B}^{(p)}, \ \pi_{A \to B}^{(h)} \big), \tag{31}$$

where the $p$-part and the $h$-part are given by

$$\pi_{A \to B}^{(p)} = \sum_{x_{A \setminus B}} p_A \prod_{C \in ne(A) \setminus B} \pi_{C \to A}^{(p)}, \tag{32}$$

$$\pi_{A \to B}^{(h)} = \frac{\sum_{x_{A \setminus B}} p_A \prod_{C \in ne(A) \setminus B} \pi_{C \to A}^{(p)} \cdot \Big( h_A + \sum_{C \in ne(A) \setminus B} \pi_{C \to A}^{(h)} \Big)}{\sum_{x_{A \setminus B}} p_A \prod_{C \in ne(A) \setminus B} \pi_{C \to A}^{(p)}}, \tag{33}$$

which follows from equations (8), (26), (27), (28) and (31).

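Example 2 Let $\pi_U$ have the chain factorization

$$\pi_U = \bigotimes_{i=1}^{n} \pi_{V_i}, \qquad V_i \sim V_{i+1} \ \text{for} \ i = 1, \ldots, n-1, \tag{34}$$

and let $\pi_{i \to (i+1)}$ stand as shorthand for the message $\pi_{V_i \to V_{i+1}}$. Since, by the
chain factorization, $ne(V_i) \setminus V_{i+1} = \{V_{i-1}\}$, the $p$-part and the $h$-part of
the message reduce to

$$\pi_{i \to (i+1)}^{(p)} = \sum_{x_{V_i \setminus V_{i+1}}} p_{V_i} \, \pi_{(i-1) \to i}^{(p)}, \tag{35}$$

$$\pi_{i \to (i+1)}^{(h)} = \frac{\sum_{x_{V_i \setminus V_{i+1}}} p_{V_i} \, \pi_{(i-1) \to i}^{(p)} \cdot \big( h_{V_i} + \pi_{(i-1) \to i}^{(h)} \big)}{\pi_{i \to (i+1)}^{(p)}}. \tag{36}$$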
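The Lauritzen-Nilsson operations can be exercised on a short chain. The following is a minimal sketch under stated assumptions: binary variables, a hypothetical three-node chain $V_1 = \{1, 2\}$, $V_2 = \{2, 3\}$, $V_3 = \{3, 4\}$ with made-up tables, and helper names (`proj`, `table`, `combine`, `marginalize`) that do not come from [3]. Messages are passed from $V_1$ towards $V_3$, and the result is marginalized onto the empty set.

```python
from itertools import product

DOM = (0, 1)  # assumption: every variable is binary

def proj(x, vars_, sub):
    """Restrict configuration x on vars_ to the variables listed in sub."""
    return tuple(x[vars_.index(v)] for v in sub)

def table(vars_, fn):
    """Tabulate fn over all configurations of the ordered tuple vars_."""
    return {x: fn(*x) for x in product(DOM, repeat=len(vars_))}

def combine(a, b):
    """Combination (26): (p_A * p_B, h_A + h_B) on the union of the domains."""
    (va, pa, ha), (vb, pb, hb) = a, b
    vu = tuple(sorted(set(va) | set(vb)))
    p, h = {}, {}
    for x in product(DOM, repeat=len(vu)):
        xa, xb = proj(x, vu, va), proj(x, vu, vb)
        p[x] = pa[xa] * pb[xb]
        h[x] = ha[xa] + hb[xb]
    return vu, p, h

def marginalize(c, keep):
    """Marginalization (27), with the convention 0/0 = 0."""
    vc, pc, hc = c
    vk = tuple(v for v in vc if v in keep)
    p, ph = {}, {}
    for x in pc:
        k = proj(x, vc, vk)
        p[k] = p.get(k, 0.0) + pc[x]
        ph[k] = ph.get(k, 0.0) + pc[x] * hc[x]
    h = {k: (ph[k] / p[k] if p[k] != 0 else 0.0) for k in p}
    return vk, p, h

# Hypothetical potentials pi_C = (p_C, h_C) as in (28); the p-parts multiply to a distribution.
V1, V2, V3 = (1, 2), (2, 3), (3, 4)
pi1 = (V1, table(V1, lambda a, b: 0.25), table(V1, lambda a, b: a + b))
pi2 = (V2, table(V2, lambda b, c: 0.7 if b == c else 0.3), table(V2, lambda b, c: 2.0 * c))
pi3 = (V3, table(V3, lambda c, d: 0.9 if c != d else 0.1), table(V3, lambda c, d: c - d))

m12 = marginalize(pi1, set(V2))                  # message V1 -> V2, a function of x2
m23 = marginalize(combine(pi2, m12), set(V3))    # message V2 -> V3, a function of x3
root = marginalize(combine(pi3, m23), set())     # normalization problem at V3

# Brute-force first order moment (18) for comparison.
_, pU, hU = combine(combine(pi1, pi2), pi3)
m_direct = sum(pU[x] * hU[x] for x in pU)

print(root[1][()], root[2][()], m_direct)
```

Up to floating point rounding, the marginal onto the empty set has $p$-part 1 and an $h$-part equal to the brute-force moment, as (30) states. Each message is a pair of tables over a single variable, so no node marginal has to be stored.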



3.3 Maua et al. algorithm 

Definition 5 (Combination) Let $\pi_A = (p_A, h_A)$ and $\pi_B = (p_B, h_B)$ be two potentials on
$A$ and $B$, respectively. The combination $\pi_A \otimes \pi_B$ of $\pi_A$ and $\pi_B$ is the
potential on $A \cup B$ given by

$$\pi_A \otimes \pi_B = (p_A p_B, \ h_A p_B + p_A h_B). \tag{37}$$

Definition 6 (Marginalization) Let $\pi_C = (p_C, h_C)$ be a potential on $C$, and let
$A \subseteq C$. The marginalization of $\pi_C$ onto $A$ is the potential on $A$ given by

$$\pi_C^{\downarrow A} = \Big( \sum_{x_{C \setminus A}} p_C, \ \sum_{x_{C \setminus A}} h_C \Big). \tag{38}$$

The following lemma can be proven by induction [6].

Lemma 1 Let $\mathcal{N} \subseteq \mathcal{V}$ and let $\pi_A = (p_A, h_A)$ be order pair
potentials for $A \in \mathcal{N}$. Then

$$\bigotimes_{C \in \mathcal{N}} \pi_C = \bigotimes_{C \in \mathcal{N}} (p_C, h_C) = \Big( \prod_{A \in \mathcal{N}} p_A, \ \sum_{A \in \mathcal{N}} h_A \prod_{B \in \mathcal{N} \setminus A} p_B \Big). \tag{39}$$

If the operations are defined in this way and the potentials are set to

$$\pi_C = (p_C, \ p_C h_C) \tag{40}$$

and the factorizations (15) and (16) hold, then

$$\pi_U = \bigotimes_{C \in \mathcal{V}} \pi_C = \Big( \prod_{A \in \mathcal{V}} p_A, \ \prod_{A \in \mathcal{V}} p_A \sum_{B \in \mathcal{V}} h_B \Big) = (p_U, \ p_U h_U). \tag{41}$$

Accordingly, we have

$$\pi_U^{\downarrow \emptyset} = \Big( \sum_{x_U} p_U, \ \sum_{x_U} p_U h_U \Big) = (1, m), \tag{42}$$

where we have used the probability condition $\sum_{x_U} p_U = 1$. Again, the first order moment
potential can be computed using the Shafer-Shenoy local computation algorithm, where the
combination and marginalization are defined by (37) and (38). As in the Lauritzen-Nilsson
algorithm, the messages have the form

$$\pi_{A \to B} = \big( \pi_{A \to B}^{(p)}, \ \pi_{A \to B}^{(h)} \big), \tag{43}$$

but now, according to (39), the $p$-part and the $h$-part of the messages are given by

$$\pi_{A \to B}^{(p)} = \sum_{x_{A \setminus B}} p_A \prod_{C \in ne(A) \setminus B} \pi_{C \to A}^{(p)}, \tag{44}$$

$$\pi_{A \to B}^{(h)} = \sum_{x_{A \setminus B}} p_A \cdot \Big( h_A \prod_{C \in ne(A) \setminus B} \pi_{C \to A}^{(p)} + \sum_{C \in ne(A) \setminus B} \pi_{C \to A}^{(h)} \prod_{D \in ne(A) \setminus B, C} \pi_{D \to A}^{(p)} \Big). \tag{45}$$

Note that the $p$-parts of the messages in the Lauritzen-Nilsson algorithm and the Maua et al.
algorithm are the same. For trees with large average degree, the $h$-parts of the messages are
more complex in the Maua et al. algorithm, due to the repeated multiplications in the products
in (45). However, the Maua et al. algorithm is simpler for chains, as the following example shows.

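Example 3 Let $\pi_U$ have the chain factorization

$$\pi_U = \bigotimes_{i=1}^{n} \pi_{V_i}, \qquad V_i \sim V_{i+1} \ \text{for} \ i = 1, \ldots, n-1, \tag{46}$$

and let $\pi_{i \to (i+1)}$ stand as shorthand for the message $\pi_{V_i \to V_{i+1}}$. Since, by the
chain factorization, $ne(V_i) \setminus V_{i+1} = \{V_{i-1}\}$, the $p$-part and the $h$-part of
the message reduce to

$$\pi_{i \to (i+1)}^{(p)} = \sum_{x_{V_i \setminus V_{i+1}}} p_{V_i} \, \pi_{(i-1) \to i}^{(p)}, \tag{47}$$

$$\pi_{i \to (i+1)}^{(h)} = \sum_{x_{V_i \setminus V_{i+1}}} p_{V_i} \cdot \big( h_{V_i} \, \pi_{(i-1) \to i}^{(p)} + \pi_{(i-1) \to i}^{(h)} \big). \tag{48}$$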
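The same chain computation can be sketched with the Maua et al. operations (37) and (38). As before, this is only an illustration under assumptions: binary variables, a hypothetical three-node chain with made-up tables, and helper names (`proj`, `potential`, `combine`, `marginalize`) that are not taken from [4].

```python
from itertools import product

DOM = (0, 1)  # assumption: every variable is binary

def proj(x, vars_, sub):
    """Restrict configuration x on vars_ to the variables listed in sub."""
    return tuple(x[vars_.index(v)] for v in sub)

def potential(vars_, p_fn, h_fn):
    """Build pi_C = (p_C, p_C * h_C) as in (40)."""
    p = {x: p_fn(*x) for x in product(DOM, repeat=len(vars_))}
    h = {x: p[x] * h_fn(*x) for x in p}
    return vars_, p, h

def combine(a, b):
    """Combination (37): (p_A p_B, h_A p_B + p_A h_B)."""
    (va, pa, ha), (vb, pb, hb) = a, b
    vu = tuple(sorted(set(va) | set(vb)))
    p, h = {}, {}
    for x in product(DOM, repeat=len(vu)):
        xa, xb = proj(x, vu, va), proj(x, vu, vb)
        p[x] = pa[xa] * pb[xb]
        h[x] = ha[xa] * pb[xb] + pa[xa] * hb[xb]
    return vu, p, h

def marginalize(c, keep):
    """Marginalization (38): both parts are summed, no division is needed."""
    vc, pc, hc = c
    vk = tuple(v for v in vc if v in keep)
    p, h = {}, {}
    for x in pc:
        k = proj(x, vc, vk)
        p[k] = p.get(k, 0.0) + pc[x]
        h[k] = h.get(k, 0.0) + hc[x]
    return vk, p, h

# Hypothetical chain V1 = {1,2}, V2 = {2,3}, V3 = {3,4}.
V1, V2, V3 = (1, 2), (2, 3), (3, 4)
pi1 = potential(V1, lambda a, b: 0.25, lambda a, b: a + b)
pi2 = potential(V2, lambda b, c: 0.7 if b == c else 0.3, lambda b, c: 2.0 * c)
pi3 = potential(V3, lambda c, d: 0.9 if c != d else 0.1, lambda c, d: c - d)

m12 = marginalize(pi1, set(V2))                  # message V1 -> V2
m23 = marginalize(combine(pi2, m12), set(V3))    # message V2 -> V3
root = marginalize(combine(pi3, m23), set())     # normalization problem at V3

# By (41), the h-part of the full combination is p_U * h_U, so its sum is m.
_, pU, phU = combine(combine(pi1, pi2), pi3)
m_direct = sum(phU.values())

print(root[1][()], root[2][()], m_direct)
```

Compared with the Lauritzen-Nilsson sketch, marginalization here involves no division and no 0/0 convention, which is one way to see why the chain messages are simpler in this algorithm.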

4 Conclusion 

We reviewed existing methods for computing first order moments on junction trees using the
Shafer-Shenoy algorithm. First, we treated the computation of first order moments as an all
vertices problem on a junction tree. Solved this way, the problem requires memory of the order
of the cardinality of the junction tree edge set. After that, we considered two algorithms, the
Lauritzen-Nilsson algorithm and the Maua et al. algorithm, which compute the first order moments
as a normalization problem on the junction tree, using memory of the order of the cardinality
of the junction tree leaf set. For trees with large average degree, the first algorithm has
simpler message formulas than the second one, while the second one is simpler for chains.

References 

[1] S. M. Aji and R. J. McEliece. The generalized distributive law. IEEE Transactions on
Information Theory, 46(2):325-343, 2000.

[2] Robert G. Cowell, A. Philip Dawid, Steffen L. Lauritzen, and David J. Spiegelhalter.
Probabilistic Networks and Expert Systems. Springer, 1999.

[3] Steffen L. Lauritzen and Dennis Nilsson. Representing and Solving Decision Problems with
Limited Information. Management Science, 47(9):1235-1251, 2001.

[4] Denis Deratani Maua, Cassio Polpo de Campos, and Marco Zaffalon. Solving limited memory
influence diagrams. CoRR, abs/1109.1754, 2011.

[5] Prakash P. Shenoy and Glenn Shafer. Axioms for probability and belief-function propagation.
In Uncertainty in Artificial Intelligence, pages 169-198. Morgan Kaufmann, 1990.

[6] V. M. Ilic, M. S. Stankovic, and B. T. Todorovic. Entropy message passing.
IEEE Transactions on Information Theory, 57(1):375-380, 2011.






